
GPU-STREAM v2.0: Benchmarking the achievable memory bandwidth of many-core processors across diverse parallel programming models


Abstract

Many scientific codes consist of memory-bandwidth-bound kernels: the dominating factor of the runtime is the speed at which data can be loaded from memory into the Arithmetic Logic Units, before results are written back to memory. One major advantage of General Purpose Graphics Processing Units (GPGPUs) and other many-core devices such as the Intel Xeon Phi is their focus on providing increased memory bandwidth over traditional CPU architectures. However, as with CPUs, this peak memory bandwidth is usually unachievable in practice, and so benchmarks are required to measure a practical upper bound on expected performance. The choice of one programming model over another should ideally not limit the performance that can be achieved on a device. GPU-STREAM has been updated to incorporate a wide variety of the latest parallel programming models, all implementing the same parallel scheme. As such, this tool can be used as a kind of Rosetta Stone which provides both a cross-platform and cross-programming-model array of results of achievable memory bandwidth.
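To make the idea of "achievable memory bandwidth" concrete, here is a minimal sketch of how a STREAM-style benchmark turns a kernel timing into a bandwidth figure. The kernel names and the per-kernel array counts follow the conventional STREAM accounting and are assumptions, since the abstract does not list them. The helper names (`bytes_moved`, `bandwidth_gb_per_s`, `triad`) are hypothetical and are not taken from the GPU-STREAM source.

```python
import time
from array import array

# Arrays read plus written per element by each STREAM-style kernel.
# Conventional counting (assumed here, not stated in the abstract):
#   copy:  c = a            -> 1 read + 1 write
#   mul:   b = s * c        -> 1 read + 1 write
#   add:   c = a + b        -> 2 reads + 1 write
#   triad: a = b + s * c    -> 2 reads + 1 write
ARRAYS_TOUCHED = {"copy": 2, "mul": 2, "add": 3, "triad": 3}


def bytes_moved(kernel, n, elem_size=8):
    """Bytes transferred by one pass of `kernel` over arrays of n elements."""
    return ARRAYS_TOUCHED[kernel] * n * elem_size


def bandwidth_gb_per_s(kernel, n, seconds, elem_size=8):
    """Sustained bandwidth in GB/s (1 GB = 1e9 bytes) for one timed pass."""
    return bytes_moved(kernel, n, elem_size) / seconds / 1e9


def triad(a, b, c, scalar):
    """Reference triad kernel: a[i] = b[i] + scalar * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + scalar * c[i]


if __name__ == "__main__":
    n = 200_000
    a = array("d", [0.0]) * n
    b = array("d", [1.0]) * n
    c = array("d", [2.0]) * n

    # Report the best of several runs, as STREAM-style benchmarks usually do.
    best = float("inf")
    for _ in range(3):
        start = time.perf_counter()
        triad(a, b, c, 0.5)
        best = min(best, time.perf_counter() - start)

    print(f"triad: {bandwidth_gb_per_s('triad', n, best):.4f} GB/s")
```

A pure-Python loop is limited by the interpreter rather than by memory, so the figure it prints only illustrates the arithmetic. A real measurement would run the same kernel in a compiled or accelerator programming model and compare the result against the device's theoretical peak.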
